Calculating distance measure for clustering in multi-relational settings

Author

  • Olegas Niakšu
Abstract

The paper applies distance-based multi-relational clustering in a case study on real data. A novel method for calculating a dissimilarity matrix in multi-relational settings is proposed and implemented in the R language. The method was tested by analyzing publications on the subject of data mining that are indexed in the medical database MedLine. Clustering based on partitioning around medoids (PAM) was used for the semi-automated identification of the most popular topics among the MedLine publications. The algorithm implements a greedy approach and is suitable for small data sets with a limited number of 1:n relational joins.
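The abstract does not spell out how the dissimilarity matrix is built, so the following is only a minimal sketch of the general idea, not the paper's method. It is written in Python rather than the R used by the author. It assumes a hypothetical target table of publications (id and year) with one 1:n child table of keywords, combines a range-normalized year difference with a Jaccard distance over the joined keyword sets, and then runs a simple greedy swap-based k-medoids search in the spirit of partitioning around medoids. All names, weights, and data values are illustrative assumptions.

```python
from itertools import combinations

# Hypothetical toy data: a target table (publication id -> year) and a
# 1:n child table (publication id -> set of keywords). Illustrative only.
publications = {1: 2005, 2: 2006, 3: 2012, 4: 2013}
keywords = {
    1: {"clustering", "genomics"},
    2: {"clustering", "proteomics"},
    3: {"text mining", "ehr"},
    4: {"text mining", "ehr", "nlp"},
}

def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

def dissimilarity_matrix(pubs, kws, w_year=0.5, w_kw=0.5):
    """Weighted mix of normalized year gap and keyword-set distance."""
    ids = sorted(pubs)
    years = [pubs[i] for i in ids]
    span = (max(years) - min(years)) or 1  # avoid division by zero
    d = {i: {j: 0.0 for j in ids} for i in ids}
    for i, j in combinations(ids, 2):
        d_year = abs(pubs[i] - pubs[j]) / span
        d_kw = jaccard_distance(kws.get(i, set()), kws.get(j, set()))
        d[i][j] = d[j][i] = w_year * d_year + w_kw * d_kw
    return ids, d

def total_cost(ids, d, medoids):
    """Sum of each object's distance to its nearest medoid."""
    return sum(min(d[i][m] for m in medoids) for i in ids)

def pam(ids, d, k):
    """Greedy k-medoids: start from the first k ids and accept every
    medoid/non-medoid swap that strictly lowers the total cost."""
    medoids = list(ids[:k])
    best = total_cost(ids, d, medoids)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in ids:
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                cost = total_cost(ids, d, trial)
                if cost < best - 1e-12:
                    medoids, best, improved = trial, cost, True
    # Assign every object to its nearest medoid.
    labels = {i: min(medoids, key=lambda m: d[i][m]) for i in ids}
    return medoids, labels

ids, d = dissimilarity_matrix(publications, keywords)
medoids, labels = pam(ids, d, k=2)
print(medoids, labels)
```

On this toy data the search groups publications 1 and 2 into one cluster and 3 and 4 into another. Every swap pass recomputes the full cost, which is affordable only for small inputs; that matches the abstract's remark that the approach suits small data sets with few 1:n joins.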


Similar resources

Using Gray Relational Analysis and Taguchi Technique in Solving Multi-objective Problems for Turning Operation of Austenitic Stainless Steel

In this study, the application of gray relational analysis (GRA) and Taguchi method in multi-criteria process parameters selection of turning operation has been investigated. The process responses under study are material removal rate (MRR) and surface roughness (SR); in turn, the input parameters include cutting speed, feed rate, depth of cut and nose radius of the cutting tool. The proposed a...


A Hybrid Grey based Two Steps Clustering and Firefly Algorithm for Portfolio Selection

Considering the concept of clustering, the main idea of the present study is based on the fact that all stocks for choosing and ranking will not be necessarily in one cluster. Taking the mentioned point into account, this study aims at offering a new methodology for making decisions concerning the formation of a portfolio of stocks in the stock market. To meet this end, Multiple-Criteria Decisi...


Robust Extension of FCMdd-based Linear Clustering for Relational Data using Alternative c -Means Criterion

Relational clustering is an extension of clustering for relational data. Fuzzy c-Medoids (FCMdd) based linear fuzzy clustering extracts intrinsic local linear substructures from relational data. However, this linear clustering is affected by noise and outliers because it uses Euclidean distance. Alternative Fuzzy c-Means (AFCM) is an extension of Fuzzy c-means, in which a modified distance meas...


Automatic clustering of mixed data using a genetic algorithm

In real-world clustering problems, cluster analysis often has to be performed on data sets with mixed numeric and categorical values. However, most existing clustering algorithms are efficient only for numeric data, not for mixed data sets. In addition, traditional methods such as the K-means algorithm usually require the user to provide the number of clusters. In this...


Improved Automatic Clustering Using a Multi-Objective Evolutionary Algorithm With New Validity measure and application to Credit Scoring

In data mining, clustering is one of the important issues for separation and classification with groups like unsupervised data. In this paper, an attempt has been made to improve and optimize the application of clustering heuristic methods such as Genetic, PSO algorithm, Artificial bee colony algorithm, Harmony Search algorithm and Differential Evolution on the unlabeled data of an Iranian bank...



Publication date: 2013